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Abstract — It was recently shown that spatial coupling of 
individual low-density parity-check codes improves the belief- 
propagation threshold of the coupled ensemble essentially to the 
maximum a posteriori threshold of the underlying ensemble. 
We study the performance of spatially coupled low-density 
generator-matrix ensembles when used for transmission over 
binary-input memoryless output-symmetric channels. We show 
by means of density evolution that the threshold saturation 
phenomenon also takes place in this setting. Our motivation 
for studying low-density generator-matrix codes is that they 
can easily be converted into rateless codes. Although there are 
already several classes of excellent rateless codes known to date, 
rateless codes constructed via spatial coupling might offer some 
additional advantages. In particular, by the very nature of the 
threshold phenomenon one expects that codes constructed on this 
principle can be made to be universal, i.e., a single construction 
can uniformly approach capacity over the class of binary- 
input memoryless output-symmetric channels. We discuss some 
necessary conditions on the degree distribution which universal 
rateless codes based on the threshold phenomenon have to fulfill. 
We then show by means of density evolution and some simulation 
results that indeed codes constructed in this way perform very 
well over a whole range of channel types and channel conditions. 



-H Index Terms— Spatial Coupling, LDGM, LDPC, LT codes, 
[/) Rate-less Codes, Raptor Codes, LDPC Convolutional Codes 
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I. Introduction 

THE idea of spatially coupling copies of a graphical model 
was introduced for the coding context in [JJ in the form 
of convolutional LDPC ensembles. The performance of such 
ensembles was investigated in, among others, and it was 
found to be very good. In particular the threshold of a coupled 
ensemble was consistently found to be significantly superior 
to the threshold of the underlying ensemble. It was then shown 
in ||5] |6l why this is the case, and the phenomena was termed 
threshold saturation. 

The key observation in the above papers is that the be- 
lief propagation (BP) threshold of the coupled ensemble is 
considerably improved and becomes close to the maximum a 
posteriori (MAP) threshold of the underlying ensemble while 
the MAP threshold of the coupled and underlying ensembles 
are close to each other This phenomenon has also been 
observed in several other classes of graphical models |7, 8| 
and seems to be rather general: when we spatially couple, 
the dynamical threshold of the chain converges to the static 
threshold of the un-coupled model. 

We study the coupling phenomenon for low-density 
generator-matrix (LDGM) codes. LDGM codes are closely 
related to LT codes. LT codes were originally designed for 
communication over the binary erasure channels (BEC) with 
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unknown erasure probability |9|. For these codes the encoder 
generates an (in principle) infinite sequence of output symbols. 
The decoder collects as many output symbols as necessary 
to successfully recover all the information bits. LT codes are 
one of the first instances of rateless codes, see ifTOl . They are 
called rateless codes because the rate of the code is not fixed 
a priori and can vary from essentially zero to essentially one, 
depending on the channel condition. 

A typical application of rateless codes is a system where the 
actual channel is unknown to the encoder and chosen from a 
given uncertainty set. LT codes can asymptotically reach 1 — /i 
of the capacity of the BEC with unknown erasure probability, 
for any fi > |9|. In particular, LT codes are universal 
over the BEC. By adding a proper precoder to the LT codes, 
Shokrollahi introduced Raptor codes which exhibit an even 
better performance in terms of encoding/decoding complexity 
and error probability ifTTI . There is a considerable literature 
on rateless codes. Let us just mention a very small selection 
and refer the reader to some of the review articles for a more 
thorough literature review. The error performance of Raptor 
codes and LT codes over binary-input memoryless output- 
symmetric (BMS) channels was investigated in |12|. Later in 
ifTOl . the authors showed how to design /i-capacity achieving 
Raptor codes, for arbitrary /i > 0, on the binary symmetric 
channel (BSC) and the binary additive white Gaussian noise 
channel (BAWGNC); the authors also proved that LT codes are 
not universal over the BSC and the BAWGN channel families. 

The objective of this paper is to introduce a further al- 
ternative to the construction of rateless codes, namely to 
construct rateless codes via spatial coupling of LT ensembles. 
We show by means of density evolution that the threshold 
saturation also takes place in this setting. We provide some 
necessary conditions on the degree distributions in order for 
the constructed ensemble to be universal. 

We describe the structure of coupled ensembles in Sec- 
tion [ll] There we also explain the relationship between LT and 
LDGM ensembles. The saturation phenomenon is investigated 



in Section 111 We derive some necessary conditions for such 



an ensemble to be universal in Section IV We also provide 



some simulation results for various channel types and rates 
which give further support for our conjecture. 

II. Rateless Ensembles from Coupled LT 
Ensembles 

We propose to construct rateless codes by spatially coupling 
LT codes. When the number of information bits and the 
number of output bits tends to infinity (at a fixed ratio), the 
performance of such a structure can in turn be assessed by 
analyzing an ensemble of spatially coupled LDGM codes. Let 
us start by recalling the definition of LT codes. 
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1) Structure of LT Ensemble: Let ui, . . . , Um denote the 
information bits we want to transmit. For LT codes, in prin- 
ciple, an infinite stream of output symbols is generated from 
these m information bits. The receiver "listens" to as many 
of them as needed in order to decode the m information bits 
reliably. 

More precisely, the encoder generates a sequence of output 
symbols as follows: First, an integer d, called the degree, 
is independently and randomly chosen according to a given 
degree distribution. This distribution is encoded by the poly- 
nomial R{x) = X]d=r Rd.x'^, where Rd is the probability of 
choosing d. Next, a d-tuple of information bits is uniformly 
picked from all (™) distinct d-tuples, denote it by (ii, . . . , id). 
Finally, the sum Ui-^ + • • • + is computed (also called 
the "output symbol") and it is transmitted over the channel. 
Here, we assumed that the transmitter and the receiver share 
randomness so that the choice of the degree as well as the 
choice of the indices is known both to the transmitted and to 
the receiver 

The receiver collects a number of output symbols (typically 
at least equal to the number of information bits) and starts the 
decoding process using the BP algorithm. If it cannot decode 
given this information, it collects further output symbols and 
retries. It continues in this manner until all m information bits 
are decoded. Assume that the receiver decodes all information 
bits using n output symbols. We then say that the code has 
rate r = ^. 

The received output symbols and information bits can be 
represented by a bipartite graph Q{U,G; E). Here U denotes 
the set of information nodes and it has cardinality m. In the 
same manner, G denotes the set of generator nodes (output 
symbols). The set E denotes all edges; there is an edge 
between a generator node and an information node iff the 
corresponding bit was used in the computation of this output 
symbol. 

2) Coupled LT Ensemble: Let us now discuss how to 
couple LT codes. Assume that the information bits are divided 
into L sets located at positions [0, L — 1] and each having m 
information bits. Let these bits be labeled from 1 to mL. Let 
the generator nodes be located at positions [0,i + w — 2], 
where w is a smoothing parameter, w € IN. To generate an 
output symbol, the encoder picks i e [0,i + w — 2]. This is 
the position of the next generator node which is being con- 
structed. Next the degree d is chosen as in the uncoupled case, 
according to the distribution R{x), and independently from all 
previous choices. Then, each of the d connections is uniformly 
and independently chosen among the mw information bits in 
the range [i — w + 1, «], see Fig.[r| For generator nodes situated 
close to the boundaries, if the position of a chosen information 
bit is not in the range [0, L — \\ then the associated edge is 
omitted. Equivalently we can assume that it is connected to a 
bit outside of this range which is known both to the transmitter 
and the receiver and whose value can without loss of generality 
be assumed to be 0. Finally, the encoder sends the sum of the 
values of the connected information bits. As for the uncoupled 
case, we assume that shared randomness is available at the 
transmitter and the receiver so that the choice of positions, 
degrees, and connected bits is known on both sides. We call 




^ L - 1 

w 

Fig. 1. Adding a new generator node to a coupled LDGM ensemble. 

the resulting ensemble a spatially coupled LT ensemble. 

3 ) LDGM Ensembles as Limits of LT Ensembles: Consider 
an uncoupled LT code. Since the degree of every generator 
node is chosen independently according to the distribution 
R{x), the empirical distribution of the degrees of the output 
symbols converges a.s. to R{x). Further, if we let m and 
n tend to infinity but fix their ratio, then the empirical 
degree distribution of the information bits converges a.s. to the 
Poisson distribution \{x) = e'"s(^^^\ where ^^vg = m^'i^) 
is the average degree of an information bit. Therefore, in this 
sense (for increasing blocklengths) the resulting code tends to 
an instance of the LDGM {e'"'""^^^^\ R{x)) ensemble. Note 
also that for any fixed number of iterations density evolution 
is continuous in the degree distribution. 

In order to study the threshold behavior of LT codes we 
can therefore study the threshold behavior of the equivalent 
LDGM ensemble. Only when we are interested in the finite- 
length scaling behavior do we need to take the small deviations 
of the degree distribution from the expected value into account. 
For coupled LT ensembles the same argument applies. There- 
fore, for the purpose of analysis, we consider L copies of 
(e'™e(^~^', i?(a;)) LDGM ensembles spatially coupled in the 
same way as described above. 

4) Design Rate: In the coupled setting, the design rate of 
the code is equal to the total number of non-trivial information 
bits, which is equal to mL, divided by the number of generator 
bits that are connected to at least one of the mL non-trivial 
information bits. We have. 

Lemma 1 (Design Rate). Consider an {X{x), R{x), L,w) 
coupled LDGM ensemble such that the underlying ensemble 
has n generator nodes and m information bits. The design 
rate is r — — ^ "^J^^-i, r-rv- 

We see that the design rate of the coupled ensemble is 
slightly decreased compared to the design rate m/n of the 
underlying ensemble. However, this rate loss vanishes with L 
at a speed of O(^). Hence, we should not pick L too small 
in order to keep the rate loss at an acceptable level. On the 
other hand, picking L very large leads to very long codes. 
Hence, there is an inherent trade-off. For LDPC ensembles, 
various ways of reducing the rate loss were suggested in 0. 
The same basic ideas can be applied in the present setting to 
substantially mitigate the rate loss. We will not pursue this 
topic further, although in a real setting it is important. 
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Fig. 2. Left: EBP GEXIT curve (dashed) and Maxwell curve (solid) of 
the (el2-32(a;-i)^^^(^)^ LDGM ensemble with Ri{x) = 0.02x + 0.6x2 + 
0.38x^^ and r = 0.5 for transmission over the BEC. The area under the 
Maxwell curve is r and the area threshold is 0.494. The BP threshold is 0.350. 
Right: The EBP GEXIT curves of the corresponding coupled ensembles for 
{L,w) equal to (64,5) (red curve), (128,5) (green curve), and (512,11) 
(blue curve). Note that these EBP GEXIT curves are all very close to the 
Maxwell curve of the underlying ensemble. 



III. Threshold Saturation of Coupled LDGM 
Ensemble 

Let us consider as example the EBP GEXIT curve of 
the (ei2-32(r!:-i)^i^^(a.)) LDGM ensemble where = 
0.02x + 0.6x2 ^_ 0.38x13 and r = ^Qln the left picture in 
Fig.|2]the EBP GEXIT curve is shown as a dashed line. For the 
given example it has the shape of an "S." Let us construct the 
Maxwell curve. We get the Maxwell curve by taking the EBP 
GEXIT curve and by cutting the "S" by means of a vertical 
line, where the line is located in such a way that the two gray 
areas are equal (see the left picture). The Maxwell curve then 
consists of the the vertical line plus the two connecting parts 
of the EBP GEXIT curve so that the total curve represents an 
increasing function. In the sequel we will refer to the entropy 
value where the vertical line is located as the area threshold 
(since the position of the vertical line is defined by an equality 
of areas). In Fig. [2] the Maxwell curve is shown as solid black 
curve and the area threshold is /i « 0.494. This has to be 
compared to the BP threshold of this ensemble which can 
be seen to be around 0.35. The significance of the Maxwell 
curve is that for a wide range of ensembles and channels it 
is conjectured to characterize the performance of the MAP 
decoder, see e.g., llT3l . 

Let us now show by means of DE computations that the 
BP threshold of the coupled ensemble is very close to the 
area threshold of the underlying ensemble for a wide range 
of BMS channels (see Fig. |2]). This observation suggests that 
the threshold saturation phenomenon also occurs in the current 
setting. 

The first step in the evaluation of the asymptotic perfor- 

'in a nutshell, the EBP GEXIT curve is the curve of all fixed points (FP) 
of density evolution (DE) for the given ensemble. For an in-depth discussion 
we refer the reader to 1131 . 

^Note that, even if we assume that the Maxwell curve characterizes the 
MAP performance, the area threshold defined above is not really the MAP 
threshold since there is an error floor (this code does not have a non-trivial 
MAP threshold). But if we assume, as it is e.g. the case for Raptor codes, that 
we are using a pre-code and that the eiTor floor is sufficiently small, then this 
area threshold has an important operational significance. As a consequence of 
the error floor, this area threshold can even be slightly larger than the Shannon 
threshold. 



mance is to write down the density evolution (DE) equations. 
To keep the notation at a manageable level, let us start with 
the case of the BEC. 

Lemma 2 (DE Equations). Consider a coupled {X{x) = 
gia„g(a;-i)^ L, w) LDGM ensemble and transmission over 

the BEC with erasure probability e. Let Xi, i € denote the 
average erasure probability which is emitted by information 
nodes at position i and yj , j G denote the average erasure 
probability which is emitted by generator nodes at position j. 
The fixed point (FP) condition implied by DE is then 

1 

Vj = l-{l-e)p{l-- ^^)' (2) 

i—j — w-\-l 

where p{x) = R'{x)/R'{l) and x, = Q for i <^ [Q, L - 1]. 

As mentioned before, since the information bits outside the 
interval [0, L — 1] are known, we can assume that xt ~ Q for 
i ^ [Q,L — 1]. The decoding process starts with x^^^^ = 1 for 
i e [0, L — 1]. Let x-'^ denote the average erasure probability 
which is emitted by information bits at position i at round I. 
If at each decoding round all x''-* are updated according to 
([TJ and (|2]l, then for each i the sequence xf ^ is monotonically 
decreasing. Since the sequence is bounded from below it must 
converge. Call the limit x|°°\ i G [0,L — 1]. We call the 
vector (xq°°', • • • ,x^°^'j^) the forward DE FP for the erasure 
probability e. From this we can compute the BP GEXIT value 
hj at the generator nodes in position j e [0, i + w — 2]. It is 
defined by, hj{xQ, ■ ■ ■ ,xl-i) = 1 - - ^ ^i)- 
The same analysis can be performed for general BMS channels 
by writing down the corresponding DE equations. Since this 
is rather routine, we skip this part. 

Let us now illustrate the results of the DE analysis. Consider 
the LDGM ensemble in Fig. |2] As discussed above, the left 
picture shows its EBP GEXIT curve as well as the derived 
Maxwell curve. The right picture shows the EBP GEXIT 
curves of the corresponding coupled ensembles with various 
values of L and w. As we can see from the picture, all these 
curves are very close to the Maxwell curve of the underlying 
ensemble and seem to approach the closer the larger we 
choose L and w (as long as w is small compared to L). So, 
according to this numerical evidence, the threshold saturation 
phenomenon occurs for this ensemble as conjectured. 

Fig. [3] shows the EBP GEXIT curves for several farther 
examples. All pictures are for the degree distribution R2{x) = 
0.360x2 0.313x3 + 0.327x22 and [L, w) equal to (32, 3) 
and (64,4). The rows correspond to transmission over the 
BEC, BSC, and BAWGNC (top to bottom), respectively. The 
columns corresponds to rates 0.2, 05, and 0.8 (left to right), 
respectively. For all these cases we see that the threshold satu- 
ration phenomenon takes place. Also, the resulting thresholds 
are all very close to the Shannon capacity. Indeed, we see that 
this code is uniformly good over these classes of channels 
and a wide range of rates. This gives further evidence to our 
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Fig. 3. EBP GEXIT curves of LDGM ensemble {X(x), R2{x)) (black curves) and the corresponding coupled ensembles for {L,w) equal to (32,3) (blue 
curves) and (64, 4) (green curves) where R2{x) = 0.360x^ + 0.313x''^ + 0.327x'^^. The ensembles are depicted for different rates 0.2, 0.5 and 0.8 in BEC, 
BSC and AWGN channel. For the rate 0.8, there is no BP threshold for the underlying LDGM ensembles. For all cases, the area threshold of individual 
ensembles is very close to the Shannon threshold. 



conjecture that rateless codes constracted on coupling of LT 
codes can be made to be universal. 

IV. Necessary Conditions on Degree Distribution 
FOR Universality of Coupled LDGM ensemble 

Although currently we do not know how to prove that 
coupled LT ensembles can be made universal, it is easy to 
derive some necessary conditions for this to happen. 

(i) Error Floor: Since we are dealing essentially with LDGM 
ensembles, our construction has generically a bit error 
floor To achieve capacity, we have to ensure that this 
error floor tends to zero as the block-length grows large. 
This induces a constraint on average degree of the gen- 
erator nodes, see [ITll^] 

(ii) Threshold Behavior: The premise of coupled ensembles 
is that their BP threshold is equal to their area threshold. 
Assuming the above premise, for a given generator degree 
distribution and a specific design rate, the corresponding 
coupled LDGM ensemble is therefore asymptotically 
(when L and w tend to infinity) capacity achieving for a 
family of channels if the area threshold of the underlying 
LDGM ensemble is equal to its Shannon threshold. If 
this property holds for any design rate, then we say that 

^^Here we assume that we want to construct our sequence of capacity 
achieving ensembles in such a way that the rate of the outer code tends 
to one. 



the coupled ensemble with that degree distribution is 
universal on that family of channels. If it holds in addition 
for any channel family (within lets say the BMS channel 
family) then we say that the ensemble is universally 
capacity achieving. 

We will see that this induces a constraint on Ri and i?2- 

A. Error Floor 

LDGM ensembles have in general a non-zero bit error 
probability below the "threshold" (which we call the error 
floor) and this error floor remains essentially unchanged by 
coupling. 

Theorem 1 (Lower Bound on Error Floor). The error floor 
of the {\{x) = e^"'-^'-^^^\ R{x)) LDGM ensemble when trans- 
mission takes place over the BEC with erasure probability e 
is lower bounded by 

Pe > ^A(6)(l + - - (3) 

Hence, a necessary condition for this expression to tend to 
zero at a fixed erasure probability e is that tends to 

infinity. 

B. Threshold Behavior 

It is shown in ifTOl that in order for a (e'^s'^^^^-', i?(a;)) 
LDGM ensemble (the asymptotic LT code with degree distri- 
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bution R{x) in their work) to be capacity achieving under BP 
decoding for a BMS channel with LLR density a, the following 
two conditions must be fulfilled: 

(i) i?i = 0, 

(ii) i?2 = 2D (a) ' where C(a) denoted the capacity of the 
channel and D{a) = J^^a{l)ta.nh{l/2)dl. 

Let us quickly review why these conditions are necessary. The 
first condition is due to the fact that if Ri > 0, the probability 
that an information bit is connected to more than one generator 
node with degree one is strictly positive. Imagine that e.g. 
two generator nodes of degree one are connected to the 
same information node. With positive probability both of them 
are received. But then clearly one of the generator nodes is 
redundant. Hence we are bounded away from capacity. 

To explain the second condition we will not follow the 
arguments used in |10| but rather use the language of EBP 
GEXIT curves. Assume that i?i is very close to zero. Let 
H{a) = 1 — C(a) be the entropy associated to the channel with 
density a. The stability condition of LDGM ensembles |13| 
implies that the entropy value where the EBP GEXIT curve 
deviates from 1 occurs at the point h = H{a) where D{a) — 
Here r is the design rate. 

If h > I - 7\ then the value of the EBP GEXIT curve at 
the Shannon threshold, 1 — r, is strictly smaller than one (see 
e.g. Fig. |2]l. Consequently, by applying the area theorem, the 
area threshold (and hence the BP threshold) must be strictly 
below the Shannon threshold. Therefore, to achieve capacity, 
we need that h < 1 — r. This implies that i?2 < ■ 

If h < 1 — r, then the EBP GEXIT curve deviates from 1 at 
h, which lies strictly below the Shannon threshold. Since the 
BP threshold cannot be greater than h, also in this case we 
cannot achieve capacity (see e.g. Fig. |3] recall that currently 
we discuss uncoupled ensembles). Therefore, we must have 
h> 1 — r. This implies that i?2 > 2D{1) ■ 

Combining these two conditions we see that we need 
equality for the uncoupled case. From this point of view it is 
immediately clear why in this framework we cannot construct 
universal codes - the right-hand side of the above equality 
depends on the channel! 

Consider now the coupled case. Condition (i) is still neces- 
sary. What about condition (ii)? It is easy to see that if the EBP 
GEXIT curve deviates from 1 to the right of the threshold, then 
also in the coupled case we are bounded away from capacity. 
So the condition h < 1 — r, or equivalently, R2 < ^jj^ still 
applies. But, due to the fact that the performance is now given 
by the Maxwell curve associated to the underlying ensembles, 
we are no longer bound by the second condition. Therefore, we 
might hope to find a value of R2 which fulfills the inequality 
i?2 < for all BMS channels. 

Lemma 3 (Minimum of Stability Condition). Over the class of 
densities a associated to BMS channels, inf, — tt-W, 

' ^ 2_D(a) 41n(2)' 

and the minimum is attained for the BSC with entropy \. 

Corollary 1 (Area Threshold of LDGM ensemble). In order 
for an [e^-'si^~^\ R{x)^ L, w) coupled LDGM ensemble to be 
asymptotically universal over all BMS channels, it is necessary 
that: (i) Ri = 0, (ii) R2 < « 0.3606. 



Let us look back at the example in Fig. |3] This degree 
distribution was designed according to Corollary [T] i.e., we 
have Ri = and R2 < 0.3606. Further, i?'(l) w 8.85. We 
can see that the GEXIT values of the underlying ensembles 
are 1 for h > 1 — r and for all tested channels. Due to a 
rather small value for R2, the BP threshold of the underlying 
ensemble is quite small (in particular for low rates) and so 
the uncoupled ensemble itself would not be useful. But as we 
have seen, the performance for the coupled case is universally 
close to the Shannon threshold. 

Let us summarize: Coupled LDGM codes have the potential 
advantage of being universal. This is indeed a nice property 
to have for typical applications of rateless codes. Further, 
since we are only concerned with the area threshold of the 
underlying ensemble, there are many more degrees of freedom 
in their design and typically only a small degree of irregularity 
suffices, making them potentially easier to implement. 

To be fair, there is a price to be payed. Due to the coupled 
structure and the fact that L has to be chosen reasonably large 
in order to avoid a large rate loss, it is difficult to construct 
codes of very short length which perform well. 
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